In recent years, the number of deployed IoT devices has undergone an undeniable explosion, reaching the scale of billions. However, new cybersecurity issues have appeared alongside this development, such as the deployment of unauthorized devices, malicious code modification, malware deployment, or vulnerability exploitation. This fact has motivated the need for new device identification mechanisms based on behavior monitoring. Moreover, these solutions have recently leveraged Machine and Deep Learning (ML/DL) techniques, thanks to advances in the field and increased processing capabilities. In contrast, attackers do not remain idle and have developed adversarial attacks focused on context modification and ML/DL evaluation evasion, applied to IoT device identification solutions. This work explores the performance of hardware behavior-based individual device identification, how it is affected by possible context- and ML/DL-focused attacks, and how its resilience can be improved using defense techniques. In this sense, it proposes an LSTM-CNN architecture based on hardware performance behavior for individual device identification. The proposed architecture has then been compared with previous techniques using a hardware performance dataset collected from 45 Raspberry Pi devices running identical software. The LSTM-CNN improves on previous solutions, achieving an average F1-score above 0.96 and a minimum TPR of 0.8 across all devices. Afterward, context- and ML/DL-focused adversarial attacks were applied against the model to test its robustness. A temperature-based context attack was not able to disrupt the identification, but several state-of-the-art ML/DL evasion attacks were successful. Finally, adversarial training and model distillation defense techniques were selected to improve the model's resilience to evasion attacks without degrading its performance.
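The abstract does not detail how the adversarial-training defense is implemented. As a rough illustration of the idea only, the following is a minimal NumPy sketch that hardens a toy logistic-regression classifier (a stand-in for the LSTM-CNN, which is not reproduced here) by training on FGSM-style perturbed samples. All function and parameter names are ours, not the paper's.

```python
import numpy as np

def fgsm_perturb(X, w, b, y, eps):
    # FGSM step for the logistic loss L = log(1 + exp(-y * (X.w + b))):
    # dL/dx = -y * sigmoid(-y * z) * w; perturb the input along its sign.
    z = X @ w + b
    s = 1.0 / (1.0 + np.exp(y * z))          # sigmoid(-y * z)
    grad_x = (-y * s)[:, None] * w
    return X + eps * np.sign(grad_x)

def adversarial_train(X, y, eps=0.1, lr=0.1, epochs=200, seed=0):
    # Gradient descent on clean + FGSM-perturbed samples (labels in {-1, +1}).
    rng = np.random.default_rng(seed)
    w, b = rng.normal(scale=0.01, size=X.shape[1]), 0.0
    for _ in range(epochs):
        X_all = np.vstack([X, fgsm_perturb(X, w, b, y, eps)])
        y_all = np.concatenate([y, y])
        s = 1.0 / (1.0 + np.exp(y_all * (X_all @ w + b)))
        w -= lr * (-(y_all * s) @ X_all / len(y_all))
        b -= lr * (-(y_all * s).mean())
    return w, b
```

The defended model sees, at every step, inputs shifted toward its own decision boundary, which is what makes evasion perturbations of the same budget less effective afterwards.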
Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBCs). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach that mitigates attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learning (RL) could be an effective approach to optimize MTD selection through trial and error, but the literature falls short when it comes to i) evaluating the performance of RL and MTD solutions in real-world scenarios, ii) studying whether behavioral fingerprinting is suitable for representing SBCs' states, and iii) calculating the consumption of resources in SBCs. To address these limitations, the work at hand proposes an online RL-based framework that learns the correct MTD mechanisms mitigating heterogeneous zero-day attacks in SBCs. The framework uses behavioral fingerprinting to represent SBCs' states and RL to learn the MTD techniques that mitigate each malicious state. It has been deployed in a real IoT crowdsensing scenario with a Raspberry Pi acting as a spectrum sensor. In more detail, the Raspberry Pi has been infected with different samples of command-and-control malware, rootkits, and ransomware, to later select between four existing MTD techniques. A set of experiments demonstrated the suitability of the framework to learn proper MTD techniques that mitigate all attacks (except a harmless rootkit) while consuming <1 MB of storage and utilizing <55% CPU and <80% RAM.
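As an illustration of RL-driven MTD selection, the sketch below is a hypothetical tabular Q-learning loop: the agent observes a malicious state (as classified from a behavioral fingerprint) and learns, by reward feedback, which MTD technique mitigates it. The attack states, MTD technique names, and effectiveness mapping are invented for the example and do not come from the paper.

```python
import random

ATTACKS = ["c2_malware", "rootkit", "ransomware"]     # malicious states (illustrative)
MTD_TECHNIQUES = ["ip_shuffle", "lib_sanitize", "file_shuffle", "service_restart"]
EFFECTIVE = {"c2_malware": "ip_shuffle",              # hypothetical mapping of which
             "rootkit": "lib_sanitize",               # MTD mitigates which attack
             "ransomware": "file_shuffle"}

def train_q(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in ATTACKS for a in MTD_TECHNIQUES}
    for _ in range(episodes):
        state = rng.choice(ATTACKS)                   # fingerprint flags this state
        if rng.random() < epsilon:                    # epsilon-greedy exploration
            action = rng.choice(MTD_TECHNIQUES)
        else:
            action = max(MTD_TECHNIQUES, key=lambda a: q[(state, a)])
        reward = 1.0 if EFFECTIVE[state] == action else -1.0
        # One-step update: the episode ends once the MTD technique is deployed.
        q[(state, action)] += alpha * (reward - q[(state, action)])
    return q

def best_mtd(q, state):
    return max(MTD_TECHNIQUES, key=lambda a: q[(state, a)])
```

In an online deployment, the reward would come from observing whether the device's fingerprint returns to a benign state after the MTD technique fires, rather than from a known mapping.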
Drowsiness is a major concern for drivers and one of the leading causes of traffic accidents. Advances in cognitive neuroscience and computer science have enabled the detection of drivers' drowsiness using Brain-Computer Interfaces (BCIs) and Machine Learning (ML). However, several challenges remain open and should be faced. First, a comprehensive evaluation of drowsiness detection performance using a heterogeneous set of ML algorithms is missing in the literature. Finally, it is necessary to study the detection performance of scalable ML models suitable for groups of subjects and compare it with the individual models proposed in the literature. To address these limitations, this work presents an intelligent framework that employs BCIs and features based on electroencephalography (EEG) to detect drowsiness in driving scenarios. The SEED-VIG dataset is used to feed different ML regressors and three-class classifiers, and the best-performing models for individual subjects and groups of subjects are then evaluated, analyzed, and compared. Regarding individual models, a Random Forest (RF) obtained a 78% F1-score, improving the 58% obtained by models used in the literature such as Support Vector Machines (SVMs). Concerning scalable models, an RF reached a 79% F1-score, demonstrating the effectiveness of these approaches. The lessons learned can be summarized as follows: i) not only SVMs but also other models insufficiently explored in the literature are relevant for drowsiness detection, and ii) scalable approaches suitable for groups of subjects are also effective in detecting drowsiness, even when new subjects not included in the model training are evaluated.
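EEG pipelines of this kind typically start from spectral features. As a generic, hedged illustration (not the paper's actual feature set), the following NumPy sketch extracts relative band power per standard EEG band from a raw signal:

```python
import numpy as np

# Conventional EEG frequency bands in Hz (half-open intervals)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(signal, fs):
    """Relative spectral power per EEG band for a 1-D signal sampled at fs Hz."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2
    total = psd[(freqs >= 1) & (freqs < 30)].sum()
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total
            for name, (lo, hi) in BANDS.items()}
```

Feature vectors like these (one per channel and time window) would then feed the regressors and classifiers that the framework compares.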
Traffic accidents are the leading cause of death among young people, a problem that claims a great number of victims nowadays. Several technologies have been proposed to prevent accidents, Brain-Computer Interfaces (BCIs) being one of the most promising. In this context, BCIs are used to detect emotional states, concentration issues, or stressful situations, which could play a fundamental role on the road since they are directly related to drivers' decisions. However, there is no extensive literature applying BCIs to detect subjects' emotions in driving scenarios. In such a context, several challenges need to be addressed, such as (i) the impact of performing a driving task on emotion detection and (ii) which emotions are more detectable in driving scenarios. To address these challenges, this work proposes a framework focused on detecting emotions from electroencephalography using machine learning and deep learning algorithms. In addition, a use case with two scenarios has been designed. In the first scenario, listening to sounds is the main task to be performed, while in the second, listening to sounds becomes a secondary task, the main one being the use of a driving simulator. In this way, it aims to prove whether BCIs are useful in such driving scenarios. The results improve those existing in the literature, achieving 99% accuracy in the detection of two emotions (non-stimulated and angry), 93% with three emotions (non-stimulated, angry, and neutral), and four emotions (non-stimulated, angry, neutral, and joy).
The explosion of computing-device deployment experienced in recent years, fostered by technologies such as the Internet of Things (IoT) and 5G, has led to a global scenario with growing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and, usually, the low complexity required to launch them. To solve this issue, several solutions have emerged to identify device models and types based on the combination of behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques. However, these solutions are not suitable for scenarios where data privacy and protection is a must, as they require data centralization for processing. In this context, newer approaches such as Federated Learning (FL) have not been fully explored yet, especially when malicious clients are present in the scenario. The present work analyzes and compares the device-model identification performance of a centralized DL model and its federated counterpart, both based on execution-time events. For experimental purposes, a dataset containing execution-time features of 55 Raspberry Pis belonging to four different models has been collected and published. Using this dataset, the proposed solution achieved 0.9999 accuracy in both settings, centralized and federated, showing no performance decrease while preserving data privacy. Later, the impact of a label-flipping attack during federated model training is evaluated using several aggregation mechanisms as countermeasures. Zeno and coordinate-wise median aggregation showed the best performance, although their performance degrades considerably when the ratio of fully malicious clients (with all training samples poisoned) grows above 50%.
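A minimal NumPy sketch contrasting plain federated averaging with the coordinate-wise median aggregation mentioned above. It only illustrates why the median resists a minority of poisoned client updates; it is not the evaluated implementation, and the update matrices are toy values:

```python
import numpy as np

def fed_avg(updates):
    # Baseline aggregation: average each parameter across client updates.
    # A single extreme (poisoned) update can shift the result arbitrarily.
    return np.mean(updates, axis=0)

def coordinate_wise_median(updates):
    # Robust aggregation: per-parameter median across clients, insensitive
    # to outliers as long as honest clients remain the majority.
    return np.median(updates, axis=0)
```

With, say, 9 honest updates near the true value and 2 poisoned ones, the median stays with the honest majority while the mean is dragged toward the attackers, which matches the >50% breakdown point reported above.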
This work explores the possibilities of federated learning for IoT malware detection and studies the security issues inherent to this new learning paradigm. In this context, a framework that uses federated learning to detect malware affecting IoT devices is presented. N-BaIoT, a dataset modeling the network traffic of several real IoT devices while affected by malware, has been used to evaluate the proposed framework. Both supervised and unsupervised federated models (multi-layer perceptron and autoencoder), able to detect malware affecting seen and unseen IoT devices, have been trained and evaluated. Furthermore, their performance has been compared with two traditional approaches. The first one lets each participant train a model locally using only its own data, while the second consists of making the participants share their data with a central entity in charge of training a global model. This comparison shows that the use of more diverse and larger data, as done in the federated and centralized methods, has a considerable positive impact on model performance. Besides, the federated models, while preserving the participants' privacy, show results similar to the centralized ones. As an additional contribution, and to measure the robustness of the federated approach, an adversarial setup with several malicious participants poisoning the federated model has been considered. The baseline model-aggregation averaging step used in most federated learning algorithms is highly vulnerable to different attacks, even with a single adversary. Therefore, the performance of other model-aggregation functions acting as countermeasures has been evaluated under the same attack scenario. These functions provide a significant improvement against malicious participants, but more efforts are still needed to make federated approaches robust.
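As a loose illustration of the unsupervised detection idea (flagging traffic whose reconstruction error exceeds a threshold learned on benign data), the sketch below substitutes a linear PCA "autoencoder" in NumPy for the neural autoencoder evaluated in the work; the function names, the rank `k`, and the 3-sigma threshold are all assumptions for the example:

```python
import numpy as np

def reconstruction_error(X, mu, components):
    # Error = distance between each sample and its projection onto the
    # low-dimensional subspace learned from benign traffic.
    Xc = X - mu
    recon = Xc @ components.T @ components
    return np.linalg.norm(Xc - recon, axis=1)

def fit_threshold(X_benign, k=2, sigmas=3.0):
    """Fit a linear 'autoencoder' (top-k PCA) on benign traffic and set the
    anomaly threshold at mean + sigmas * std of the benign errors."""
    mu = X_benign.mean(axis=0)
    _, _, vt = np.linalg.svd(X_benign - mu, full_matrices=False)
    components = vt[:k]
    errors = reconstruction_error(X_benign, mu, components)
    return mu, components, errors.mean() + sigmas * errors.std()

def is_malicious(X, mu, components, threshold):
    return reconstruction_error(X, mu, components) > threshold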
Modelling and forecasting real-life human behaviour using online social media is an active endeavour of interest in politics, government, academia, and industry. Since its creation in 2006, Twitter has been proposed as a potential laboratory that could be used to gauge and predict social behaviour. During the last decade, the user base of Twitter has been growing and becoming more representative of the general population. Here we analyse this user base in the context of the 2021 Mexican Legislative Election. To do so, we use a dataset of 15 million election-related tweets in the six months preceding election day. We explore different election models that assign political preference to either the ruling parties or the opposition. We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods. These results demonstrate that analysis of public online data can outperform conventional polling methods, and that political analysis and general forecasting would likely benefit from incorporating such data in the immediate future. Moreover, the same Twitter dataset with geographical attributes is positively correlated with results from official census data on population and internet usage in Mexico. These findings suggest that we have reached a period in time when online activity, appropriately curated, can provide an accurate representation of offline behaviour.
We present a Machine Learning (ML) case study illustrating the challenges of clinical translation for a real-time AI-empowered echocardiography system with data from ICU patients in LMICs. The case study covers data preparation, curation, and labelling from 2D ultrasound videos of 31 ICU patients in LMICs, together with the selection, validation, and deployment of three thinner neural networks to classify the apical four-chamber view (4CV). Results of the ML heuristics showed promising implementation, validation, and application of thinner networks to classify 4CV with limited datasets. We conclude this work by noting the need for (a) datasets with improved diversity of demographics and diseases, and (b) further investigation of thinner models that can run on low-cost hardware so they can be clinically translated in the ICU in LMICs. The code and other resources to reproduce this work are available at https://github.com/vital-ultrasound/ai-assisted-echocardiography-for-low-resource-countries.
Explainability is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the topic, yet explainability still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation that is a synthesis of what can be found in the literature. We recognize that explanations are not atomic but the product of evidence stemming from the model and its input-output behavior, and of the human interpretation of this evidence. Furthermore, we fit explanations into the properties of faithfulness (i.e., the explanation being a true description of the model's decision-making) and plausibility (i.e., how convincing the explanation looks to the user). Using our proposed theoretical framework simplifies how these properties are operationalized and provides new insight into common explanation methods that we analyze as case studies.
Content moderation is the process of screening and monitoring user-generated content online. It plays a crucial role in stopping content resulting from unacceptable behaviors such as hate speech, harassment, violence against specific groups, terrorism, racism, xenophobia, homophobia, or misogyny, to name a few, on Online Social Platforms. These platforms make use of a plethora of tools to detect and manage malicious information; however, malicious actors also improve their skills, developing strategies to bypass these barriers and continue spreading misleading information. Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems. In response to this ongoing issue, this paper presents an innovative approach to address this linguistic trend in social networks through the simulation of different content evasion techniques and a multilingual Transformer model for content evasion detection. In this way, we share with the rest of the scientific community a multilingual public tool, named "pyleetspeak", to generate/simulate in a customizable way the phenomenon of content evasion through automatic word camouflage, and a multilingual Named-Entity Recognition (NER) Transformer-based model tuned for its recognition and detection. The multilingual NER model is evaluated in different textual scenarios, detecting different types and mixtures of camouflage techniques, achieving an overall weighted F1-score of 0.8795. This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of content evasion on social networks, making the fight against information disorders more effective.
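A toy sketch of the leetspeak-style word camouflage that tools like "pyleetspeak" simulate in a far more configurable way; the substitution map and function below are invented for illustration and are not the tool's API:

```python
import random

# Illustrative character substitutions used in leetspeak-style camouflage
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}

def camouflage(word, p=1.0, seed=None):
    """Replace each mappable character with probability p, simulating the
    keyword twisting used to evade content moderation systems."""
    rng = random.Random(seed)
    return "".join(LEET_MAP[c] if c in LEET_MAP and rng.random() < p else c
                   for c in word.lower())
```

Generating camouflaged variants like these is exactly what provides the synthetic training data for a NER model that must recognize the original keyword underneath the substitutions.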